FILE WRAPPER FOR PROVISIONAL U.S. APPLICATION

NO: 60/119,210

INVENTORS: GREGORY A. STOBBS
           JOHN V. BIERNACKI

FILING DATE: FEBRUARY 5, 1999

TITLE: COMPUTER-IMPLEMENTED PATENT PORTFOLIO ANALYSIS
       METHOD AND APPARATUS



*RELATED U.S. APPLICATION DATA:

USSN 10/806,307 FILED MARCH 22, 2004
PENDING

PATENT APPLICATION PUBLICATION 2004/0181427 A1

USSN 09/499,238 FILED FEBRUARY 7, 2000
PENDING

US PROVISIONAL APPLICATION SERIAL NO. 60/119,210
FILED FEBRUARY 5, 1999 [Captioned File]

*The related U.S. application data is drawn from the USPTO's public website and is not to be construed as a
complete family of applications. Complete family information is available from the USPTO under 37 CFR §1.14.



NOTES:

■ THIS FILE HISTORY COPY IS AN "AS IS" COPY PROVIDED BY THE USPTO AS
PART OF THE PHOENIX APPLICATION MANAGEMENT SYSTEM - IMAGE 
FILE WRAPPERS (IFW). THE TRADITIONAL COVER AND CONTENTS PAGES 
ARE NOT PRESENT IN IFW FILES. A COPY OF THE PAIR CONTENTS IS 
ENCLOSED. 







1. Application                                02-05-1999
2. Initial Exam Team nn                       02-12-1999
3. IFW Scan & PACR Auto Security Review       02-22-1999
4. Application Dispatched from OIPE           03-01-1999
5. Set Application Status                     09-21-2001






Harness, Dickey & Pierce, P.L.C. 

ATTORNEYS AND COUNSELORS 
P.O. BOX 828
BLOOMFIELD HILLS, MICHIGAN 48303
U.S.A.



Date: February 5, 1999

Commissioner of Patents and Trademarks
Washington, D.C. 20231 
ATTN: Box Provisional Patent Application 

Re: Title: Computer-Implemented Patent Portfolio Analysis 
Method and Apparatus 
Atty. Docket: None 

Sir: 



TELEPHONE 
(248) 641-1600 

TELEFACSIMILE 
(248) 641-0270 






This is a request for filing a provisional patent application. Pursuant to 37 C.F.R. 1.51(c), the 
following information and documents are provided: 

1. The names and addresses of the inventors: 

First Inventor: Gregory A. Stobbs 



Residence: 971 Charrington, Bloomfield Hills, Michigan 48301



Second Inventor: John V. Biernacki 



Residence: 2912 Ravine Drive, Apt. #306, Lake Orion, Michigan 48360

2. A specification having 21 pages.

3. [X] 1 sheet of drawings showing Figure 1.

4. [ ] This invention was made by an agency of the United States Government or under a
contract with an agency of the United States Government under contract number 



5. [X] A Verified Statement Claiming Small Entity Status is enclosed. 

6a. [X] A check is enclosed to cover the fees as calculated below. The Commissioner is 
hereby authorized to charge any additional fees which may be required, or credit any 
overpayment to Deposit Account No. 08-0750. A duplicate copy of this document is 
enclosed. 

6b. [ ] The fees calculated below will be paid within the time allotted for completion of the 
filing requirements. 

6c. [ ] The fees calculated below are to be charged to Deposit Account No. 08-0750. The 
Commissioner is hereby authorized to charge any additional fees which may be 
required, or credit any overpayment to said Deposit Account. A duplicate copy of this 
document is enclosed. 



Date: February 5, 1999 



FILING FEE CALCULATION - BASIC FEE                 $150.00
FILING FEE - NON-SMALL ENTITY
FILING FEE - SMALL ENTITY: Reduction by 1/2
  (A Verified Statement is enclosed.)               $75.00
Assignment Recordal Fee ($40.00)
TOTAL                                               $75.00





7. [ ] An Assignment of the invention is enclosed. The required cover sheet under 37
C.F.R. §3.11, §3.28 and §3.41 is attached.

8. [ ] Because the enclosed application is in a non-English language, a verified English
translation for examination purposes of same [ ] is enclosed [ ] will be filed within
the allotted time period.

9. [X] An Express Mailing Certificate is enclosed. 



10. [ ] Other 



11. Please direct all correspondence and telephone calls relative to this application to the 
undersigned at the following address:

HARNESS, DICKEY & PIERCE, P.L.C.
P. O. Box 828
Bloomfield Hills, Michigan 48303
(248) 641-1600 



If, for some reason, Applicant(s) has/have not paid a sufficient fee, please charge our 
Deposit Account No. 08-0750 for any further fees which may be due or credit any overpayment 
to Deposit Account No. 08-0750. A duplicate copy of this document is enclosed. 



Respectfully, 

Gregory A. Stobbs
Reg. No. 28764 



COMPUTER-IMPLEMENTED PATENT PORTFOLIO ANALYSIS METHOD AND
APPARATUS

Inventors: 

Gregory A. Stobbs 
971 Charrington 
Bloomfield Hills, MI 48301
Citizenship: United States 

John V. Biernacki 
2912 Ravine Drive 
Apt. #306
Lake Orion, MI 48360
Citizenship: United States 

Background of the Invention 

The present invention relates generally to a computer-implemented system for 
analyzing patents. More particularly, the present invention relates to a computer- 
implemented system for analyzing patents with linguistic and other computer techniques. 

Description of the Preferred Embodiment 

Figure 1 depicts a comprehensive computer-implemented patent portfolio 
analysis system. Linguistic analysis techniques are combined with other techniques in 
order to categorize and/or analyze a plurality of patents or patent applications. In order 
to achieve a higher quality of associating patents with proper categories, the preferred 
embodiment of the present invention utilizes a multi-tiered approach. 

A linguistic analysis engine (1) produces clusters of patents which have been 
grouped according to linguistic similarity. Linguistic analysis engine may examine one or 
more of the following sections of a patent in order to determine which patents are similar 
based upon linguistic analysis: claims; abstract; summary; preferred embodiment;
and/or background of the invention. In the preferred embodiment, linguistic analysis
engine examines the claims and abstracts of the patents.

Linguistic analysis engine uses one or more of the following types of linguistic 
engines: a word or words engine; a core word engine; and an eigenvector analysis 
engine. A word analysis engine examines whether patents have similar types of words 
in common. A word analysis engine preferably utilizes a thesaurus in order to more 
flexibly determine that a group of patents utilizes similar words. For example, but not 
limited to, a word analysis engine may have within its thesaurus as approximate 
synonyms the terms memory and storage. 
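
The thesaurus-based comparison described above can be illustrated with a minimal sketch. The thesaurus contents, the function names, and the use of a Jaccard overlap as the similarity measure are all assumptions made for illustration; the specification does not prescribe a particular measure.

```python
# Hypothetical thesaurus: maps each word to a canonical label for its synonym group.
THESAURUS = {"memory": "storage", "storage": "storage"}


def normalize(text):
    """Lowercase the words, strip punctuation, and map synonyms to one label."""
    words = (w.strip('.,;:()"').lower() for w in text.split())
    return {THESAURUS.get(w, w) for w in words if w}


def word_overlap(text_a, text_b):
    """Jaccard overlap of the normalized word sets (an assumed similarity measure)."""
    a, b = normalize(text_a), normalize(text_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

With this sketch, "a memory device" and "a storage device" are treated as using the same words, because "memory" and "storage" share one thesaurus label.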

Core word analysis engine produces clusters based upon predetermined patent 
sections containing similar word roots. For example, but not limited to, with a first patent
containing the word "fastener" and a second patent containing the word "fasten", the
core word analysis engine determines that these two words contain the same root word
"fasten" and clusters the two patents based upon the two patents sharing a certain
number of root words.
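
The root-word comparison can be sketched as follows. The suffix list, the minimum root length, and the threshold parameter `min_shared` are hypothetical choices for illustration; a production system would more likely use an established stemming algorithm.

```python
# Assumed, deliberately tiny suffix list; longer suffixes are tried first.
SUFFIXES = ("ers", "ing", "er", "ed", "s")


def root(word):
    """Strip the first matching suffix, keeping a root of at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def shared_roots(text_a, text_b):
    """Return the set of root words that appear in both texts."""
    roots_a = {root(w) for w in text_a.split()}
    roots_b = {root(w) for w in text_b.split()}
    return roots_a & roots_b


def same_cluster(text_a, text_b, min_shared=1):
    """Cluster two texts together when they share at least min_shared roots."""
    return len(shared_roots(text_a, text_b)) >= min_shared
```

Here `root("fastener")` and `root("fasten")` both yield "fasten", so two patents containing those words share one root word.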

An eigenvector analysis engine produces clusters based upon an alternate 
technique. The alternate technique for forming coarse claim clusters employs a 
dimensionality reduction process that yields a plurality of eigenvectors that represent 
the claim space occupied by a plurality of patent claims that have already been labeled 
as belonging to a known cluster or category group. The technique works as follows. 

A corpus of training claims is assembled to represent the entire claim space with which
the patent portfolio analyzer is intended to operate. The training claims can be selected
from actual patents, or they may be drafted specifically for the training operation. Each
claim in the training corpus may be labeled according to the user's preassigned cluster
categories. Later, when the eigenvector system is used, uncategorized claims are
projected into the eigenspace and associated with the closest training claim within the
eigenspace. In this way, the uncategorized claim may be assigned to the category of its
closest categorized neighbor.

To construct the eigenspace we first form supervectors representing 
distinguishing features of a claim using a predefined format. The predefined format, 
itself, is not critical. Any suitable format may be used provided that such format is used
consistently for all claims in the training corpus and all claims later being categorized by 
eigenspace projection. 

In one form, the supervector for each claim may consist of a one-dimensional
array of integer values, where each integer corresponds to one word in the claim. The 
array of integers may be indexed in the order that the words appear in the claim. Integer 
numbers may be assigned to words by first forming a dictionary of all words found in the 
training corpus, deleting any noise words (such as articles or short prepositions), 
alphabetizing the dictionary and then sequentially assigning integer numbers. 

In this embodiment, a predefined maximum array size may be established, so 
that the supervectors for all claims will have the same number of array elements. Claims 
having fewer words than the maximum array size are handled by inserting a null 
character in each array element that does not contain a word integer. Claims that 
exceed the maximum array size are truncated at the maximum array size, using the final 
element of the array as a flag to indicate overflow. A suitable overflow character may be 
selected for this purpose. 
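
A minimal sketch of this fixed-length, word-order supervector follows. The noise-word list, the null value `0`, the overflow flag `-1`, and the choice to skip words that are absent from the dictionary are assumptions; the specification leaves these details open.

```python
NOISE_WORDS = {"a", "an", "the", "of", "to", "in"}  # assumed noise-word list
NULL = 0        # assumed null value for unused array elements
OVERFLOW = -1   # assumed flag stored in the final element on overflow


def tokenize(claim):
    """Lowercase the claim, strip punctuation, and drop noise words."""
    words = (w.strip(".,;:()").lower() for w in claim.split())
    return [w for w in words if w and w not in NOISE_WORDS]


def build_dictionary(training_claims):
    """Alphabetize every non-noise word in the corpus and number them from 1."""
    vocab = sorted({w for claim in training_claims for w in tokenize(claim)})
    return {word: i for i, word in enumerate(vocab, start=1)}


def supervector(claim, dictionary, max_size):
    """Encode the claim as word integers in order of appearance, padded or truncated."""
    # Words missing from the dictionary are skipped (an assumption).
    codes = [dictionary[w] for w in tokenize(claim) if w in dictionary]
    if len(codes) > max_size:
        # Truncate and use the final element as the overflow flag.
        return codes[: max_size - 1] + [OVERFLOW]
    return codes + [NULL] * (max_size - len(codes))
```

Because word integers start at 1, the null value 0 and the overflow flag -1 cannot collide with a real word.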

Alternatively, a supervector may be constructed by defining a one-dimensional
array of size equal to the number of words in the claim language dictionary. The array is 
then populated by integer numbers indicating the number of times each word appears in 
the claim. This will, of course, result in an array that is populated by many zeroes as 
most claims do not use all words in the claim dictionary. 
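
The word-count form of the supervector can be sketched in a few lines. The function name and the list-based inputs are illustrative assumptions.

```python
from collections import Counter


def count_supervector(claim_words, dictionary_words):
    """Return one count per dictionary word, in dictionary order.

    Words that do not occur in the claim contribute a zero, which is why
    the resulting array is mostly zeroes for a large dictionary.
    """
    counts = Counter(claim_words)
    return [counts[word] for word in dictionary_words]
```

For a dictionary of "address", "data", "store" and a claim containing "data" twice and "store" once, the supervector is `[0, 2, 1]`.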



The above two alternative supervector configurations produce fairly large 
structures. However, these large structures are reduced in forming the eigenspace to a 
set of eigenvectors equal in number to the number of claims used in the training corpus. 
Although this dimensionality reduction step is computationally expensive, it only needs to 
be performed once to define the eigenspace. 

A third alternate embodiment employs a supervector that is based on a 
preprocessing step whereby each claim is reduced to its component parts of speech 
using a natural language parser. The resulting tree structure may then be 
parameterized and stored as elements of the supervector, along with the respective 
word integers occupying each node of the tree. In effect, parsing the claim produces 
something similar to a grammatical sentence diagram in which the relationships and 
grammatical function of sentence fragments and phrases are revealed. 

After supervectors have been generated for each of the training claims, a 
suitable dimensionality reduction process is performed on the supervectors. Principal 
component analysis is one such dimensionality reduction process. There are others. 
Dimensionality reduction results in a set of eigenvectors, equal in number to the number 
of claims in the training corpus. These eigenvectors define an eigenspace that 
represents the claim scope occupied by the respective members of the training corpus.
The eigenspace is an n-dimensional space (n being the number of claims in the training 
corpus). Each of the n dimensions is defined by the dimensionality reduction process 
(e.g. principal component analysis) to maximally distinguish claims from each other.
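
One common way to carry out such a reduction, when the supervectors are much longer than the number of training claims, is to decompose the n-by-n Gram matrix of the mean-centered supervectors. The notation below is an assumption introduced for illustration and is not taken from the specification: s_1, ..., s_n are the training supervectors of length d, and each u_k is a unit-length eigenvector.

```latex
% Mean supervector and mean-centered data matrix (one training claim per row)
\bar{s} = \frac{1}{n}\sum_{i=1}^{n} s_i, \qquad
X = \begin{bmatrix} (s_1-\bar{s})^{\top} \\ \vdots \\ (s_n-\bar{s})^{\top} \end{bmatrix}
\in \mathbb{R}^{n\times d}

% Eigen-decomposition of the n-by-n Gram matrix
X X^{\top} u_k = \lambda_k u_k, \qquad k = 1,\dots,n

% Corresponding unit-length eigenvector in supervector space (for \lambda_k > 0)
e_k = \frac{X^{\top} u_k}{\sqrt{\lambda_k}}

% Coordinate k of any supervector s projected into the eigenspace
y_k(s) = e_k^{\top}\,(s-\bar{s})
```

Under these assumptions the number of eigenvectors is bounded by the number of training claims n, and the same projection applies to training claims and to claims categorized later.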

After the eigenspace has been constructed, each claim in the training corpus 
may be projected into that space by performing the same dimensionality reduction 
process upon the supervector for that one claim. This places each claim as a point 
within the n-dimensional eigenspace. Each point may be labeled with its corresponding 
cluster or category designation. Thus regions within eigenspace near a given labeled
point represent subject matter that is likely to be similar to the subject matter of the claim
that defined the given point.

After the eigenspace is constructed and all known points have been placed into 
that space and labeled, the system may be used to analyze uncategorized claims. This 
is done using the same procedure that was used to place categorized claims into the
eigenspace. Thus the uncategorized claim is processed to generate its supervector and
that supervector is dimensionality reduced (e.g. through principal component analysis)
and placed into the eigenspace. Next, a searching algorithm explores each of the
labeled points in close proximity to the newly placed point to determine which is the
closest. A geometric distance (in the n-dimensional space) may be used to determine
proximity. If the newly projected claim is within a predefined proximity of the closest
training claim point, it may be assigned to the cluster or category of the training claim. If
the newly projected point is outside a predefined threshold from its closest neighbor,
suggesting that the new claim is not all that similar to the existing claims, then the new
claim is not assigned to the closest neighbor's category. Rather, the new point is treated
as a new cluster within the eigenspace. After the system has been used for a while, the 
user may manually examine the content of new clusters, giving them labels that may be 
subsequently used for further claim processing. 
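
The nearest-neighbor assignment with a proximity threshold can be sketched as follows. The sketch assumes that the claims have already been projected into the eigenspace, uses Euclidean distance as the geometric distance, and returns `None` to signal that a new cluster should be started; the function name and return convention are illustrative.

```python
import math


def classify(point, labeled_points, threshold):
    """Return the label of the closest training point, or None when the
    closest point lies farther away than the threshold (new cluster)."""
    best_label, best_distance = None, math.inf
    for coords, label in labeled_points:
        distance = math.dist(point, coords)  # Euclidean distance
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label if best_distance <= threshold else None
```

For example, with training points at (0, 0) labeled "A" and (10, 0) labeled "B" and a threshold of 2, a new point at (1, 1) is assigned to "A", while a point at (5, 5) is assigned to no existing category.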

Linguistic analysis engine produces coarse patent clusters based upon utilizing 
one or more of the aforementioned engines. The term "coarse" in coarse
patent clusters is utilized within the present invention to designate that the patent
clusters produced by linguistic analysis engine are refined by subsequent
processes according to the teachings of the present invention.

In an alternate embodiment, linguistic analysis engine can use, separately from or in
concert with the aforementioned linguistic engines, a claim meaning analysis engine (2).
A claim meaning analysis engine examines one or more claims of a patent in order to determine
the meaning or semantics of the claim. For example, but not limited to, claim meaning 
analysis engine examines the words contained within a "wherein" or "whereby" claim 
clause in order to partially or wholly determine the meaning or gist of a claim. Moreover, 
a claim's preamble can be examined to determine claim meaning, as well as using claim 
element position to determine claim meaning since typically claim elements which 
appear later in a claim contain the more important components. Also, if file history data 
is available electronically, then responses to office actions can be examined to 
determine what claim limitations were most important in order to make a patent 
distinguishable over the prior art. Claim meaning analysis engine can use one or more 
of these aspects (e.g., wherein analysis, preamble analysis, etc.) in order to best 
determine the meaning of a claim. Each of these aspects can be weighted to make one 
aspect more predominant in determining the meaning of a claim. 

Claim meaning analysis engine can utilize a linguistic tagger software in order to 
identify parts of speech in a claim such as identifying a "wherein" or a "whereby" clause 
as well as relative purpose clauses (which clauses can be used to determine a chief 
purpose for one or more elements of a claim). One such linguistic tagger software package,
given by way of non-limiting example, is the Xtag software package from the
University of Pennsylvania.

Moreover, an expert system can be used alone or in concert with linguistic tagger 
software in order to determine the meaning of a claim. The expert system includes claim 
meaning expert rules in order to identify the meaning of the claim. For example, a claim 
meaning expert rule includes a larger weighting factor being applied to a phrase which 
is: part of a wherein clause and the wherein clause appears in the last portion of the 
claim. 



Another exemplary non-limiting claim meaning expert rule is where a claim 
element utilizes similar words to the words which appear in a claim's preamble. The 
expert system would more heavily weight such a claim element since a claim element 
which discusses the goal of the preamble is more likely to be an important element. 
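
The two exemplary expert rules can be sketched as multiplicative weights on claim elements. The bonus values, the noise-word list, and the reading of "last portion" as the second half of the element list are assumptions made for illustration only.

```python
NOISE = {"a", "an", "the", "of", "for", "to", "and", "said"}  # assumed noise words


def element_weights(preamble, elements, wherein_bonus=2.0, preamble_bonus=1.5):
    """Weight each claim element according to the two example rules."""
    preamble_words = {w.strip(",;:.").lower() for w in preamble.split()} - NOISE
    last_portion_start = len(elements) // 2  # assumed meaning of "last portion"
    weights = []
    for index, element in enumerate(elements):
        words = {w.strip(",;:.").lower() for w in element.split()}
        weight = 1.0
        # Rule 1: a wherein clause that appears in the last portion of the claim.
        if "wherein" in words and index >= last_portion_start:
            weight *= wherein_bonus
        # Rule 2: the element reuses (non-noise) words from the preamble.
        if preamble_words & words:
            weight *= preamble_bonus
        weights.append(weight)
    return weights
```

For a preamble "A fastener for joining panels" and elements "a threaded shaft", "a head attached to the shaft", and "wherein the shaft engages the panels", only the final element is boosted: it satisfies both rules and receives a weight of 3.0.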

Claim meaning analysis engine also includes in an alternate embodiment a 
neural network being utilized either alone or in concert with linguistic tagger software 
and/or expert system in order to determine meaning of a claim. The neural network is 
preferably a multi-tiered neural network with hidden layers whose weights have been 
adjusted due to training. Training includes processing a predetermined number of 
patent claims and/or patent abstracts through a multi-tiered hidden layer neural network 
and adjusting the weights based upon how well the neural network has determined the 
meaning of the claim. 

Claim meaning analysis engine provides the meaning of each claim of a patent to 
linguistic analysis engine so that linguistic analysis engine can use one or more of its 
engines to produce coarse patent clusters. Moreover, in still another alternate 
embodiment of the present invention, claim meaning analysis engine produces its own 
coarse patent clusters based upon which patent claims have similar meanings. 

The preferred embodiment of the present invention includes a patent 
classification engine (3). Patent classification engine is utilized by the present invention 
preferably in combination with linguistic analysis engine and claim meaning analysis 
engine in order to determine with high fidelity which patents belong in the same cluster. 
Patent classification engine examines the United States Patent classification of a patent 
relative to the classification of another patent or relative to a predetermined classification 
in order to determine whether the first patent should be placed in the same cluster as 
another patent. Patent classification engine examines this relationship by determining 
the degree of relatedness between two United States patent classifications. For
example, a cluster of patents will be obtained for those patents which are only five "class
steps" away from each other or from a predetermined classification. Within the present
invention, the term class step refers to the tree-like structure of the United States patent 
classification wherein a parent-child relationship within such a classification system 
would constitute one class step. 
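
Counting class steps amounts to measuring the path length between two nodes of the classification tree. In the sketch below the tree, its labels ("X", "X/1", and so on), and the function names are hypothetical placeholders and do not correspond to actual United States patent classifications.

```python
# Hypothetical classification tree, stored as child -> parent.
PARENT = {"X/1": "X", "X/2": "X", "X/1.1": "X/1", "X/1.2": "X/1"}


def ancestors(node):
    """Return [node, parent, grandparent, ...] up to the root of the tree."""
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def class_steps(a, b):
    """Number of parent-child steps between a and b, or None if unrelated."""
    depth_from_b = {node: steps for steps, node in enumerate(ancestors(b))}
    for steps_from_a, node in enumerate(ancestors(a)):
        if node in depth_from_b:  # first common ancestor
            return steps_from_a + depth_from_b[node]
    return None


def related(a, b, max_steps=5):
    """Apply the 'within five class steps' example from the text."""
    steps = class_steps(a, b)
    return steps is not None and steps <= max_steps
```

In this hypothetical tree, "X/1.1" is one class step from its parent "X/1", two steps from its sibling "X/1.2", and three steps from "X/2".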

In an alternate embodiment, the search notes produced by the United States 
Patent Office are used to determine which classifications relate to one another. 

A refined cluster generator (4) produces refined patent clusters based upon the 
coarse patent clusters which are available from one or more of the aforementioned 
engines. Refined cluster generator produces refined patent clusters based upon a 
relationship among the linguistic clusters, the clusters from the classification degree of 
relatedness, and clusters from the patent claim meaning engine. Refined cluster 
generator utilizes in the preferred embodiment a factor approach wherein different 
weights are attributed to each of these different types of clusters. For example, linguistic 
clusters may be weighted with a higher factor value than a cluster from the patent claim 
meaning engine. These factor values allow clusters from different types of engines to be 
utilized according to how well the engine can cluster for the application at hand. 

Moreover, the present invention in the preferred embodiment utilizes factor 
values within the clusters from the linguistic analysis engine. For example, linguistic 
analysis engine produces a score for each patent on how well a patent fits within a 
particular cluster. A factor value is preferably used to indicate how well that patent fits 
within a linguistic cluster. An exemplary factor approach includes a factor value of 1 
being given to a patent whose cluster score indicates an excellent fit within the cluster. 
A factor value of 0.75 is associated with a patent with only a good cluster score. A factor
value of 0.5 is associated with a patent which has only an average cluster score. A
factor value of 0.25 is associated with a patent with a below average cluster score and a
factor value of 0 is associated with a patent whose cluster score is extremely poor.

Refined cluster generator is able to produce a more refined patent cluster than 
any of the engines since refined cluster generator produces clusters based upon more 
information than is available to any one engine. Refined cluster generator provides the 
refined patent clusters to patent category engine (5). 

Patent category engine (5) associates each refined patent cluster with a 
category. A category may already exist, for example, through a client previously 
providing certain categories. The present invention also includes dynamically 
determining the categories, for example, by using the United States patent classification
titles which are found for each patent within a particular cluster. Moreover, categories 
may be dynamically determined by examining the key core words or key words 
associated with a cluster produced from linguistic analysis engine and/or claim meaning 
analysis engine. 

In an alternate embodiment, both predetermined categories and dynamically 
determined categories are utilized since the predetermined categories may not address 
all of the clusters. 

Patent portfolio analysis engine (6) receives the categorized refined patent 
clusters from patent category engine. Patent portfolio analysis engine examines the 
patents in each cluster by determining, for example, how one assignee's patents have 
clustered in each category with respect to a second assignee's patents. In the preferred 
embodiment, patent portfolio analysis engine includes a patent portfolio comparison 
analysis engine in order to perform that function. 

Patent portfolio analysis engine preferably includes a claim breadth analysis 
engine in order to analyze the breadth of each patent claim. Claim breadth is important,
for example, for determining which patents are the broadest and hence more likely to be
infringed. Claim breadth analysis engine in one embodiment examines the number of
words of a claim in order to provide an indication of how broad a claim is. In the 
preferred embodiment, an adjusted claim length is utilized wherein the number of words 
in a claim's preamble is accorded less weight. Preferably, claim breadth analysis engine 
reduces the total number of words in a claim by half of the number of words in a claim's 
preamble. 
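
The adjusted claim length can be computed as the total word count minus half of the preamble word count. The sketch assumes that the preamble is everything before a transition word such as "comprising"; that convention and the function name are illustrative and not stated in the specification.

```python
def adjusted_claim_length(claim, transition="comprising"):
    """Total words in the claim, reduced by half of the preamble's word count."""
    words = claim.split()
    lowered = [w.strip(",:;").lower() for w in words]
    # Assumed convention: the preamble is the run of words before the transition.
    preamble_length = lowered.index(transition) if transition in lowered else 0
    return len(words) - preamble_length / 2
```

For the twelve-word claim "A fastener for joining panels, comprising: a threaded shaft; and a head." the preamble has five words, so the adjusted length is 12 - 2.5 = 9.5.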

In an alternate embodiment, claim breadth analysis engine represents each cluster in a
Cartesian graphical format as a center point and a varying or non-varying radius about
that center point, the radius representing the patents of the cluster which are the furthest
distance on a linguistic basis from the cluster's center point. The
present invention examines the average length of the cluster based upon this Cartesian
representation in order to determine claim breadth. Both the average length of the
cluster and the adjusted word count are utilized in the preferred embodiment to
determine which claims are the broadest.

Moreover, a database of patents (7) is provided which has United States patent 
information and foreign (e.g., PCT) patent and foreign (e.g., PCT) patent application 
information. The database of patents is utilized to identify which patents are the most 
"important" since there is a relationship between importance of a patent and in how 
many countries a patent has been filed. 

In an alternate embodiment of the present invention, patent portfolio analysis 
engine is utilized without the clustering technique and is utilized primarily only through 
the database of patents. This alternate embodiment is utilized typically when patent 
portfolio analysis is performed without clustering. This may be done when only claim 
breadth analysis without categorization is satisfactory for the application at hand. 

A filter is used in order to reduce the number of "noise" patents which are 
identified as the result of key word patent searching. The filter identifies high fidelity and
low fidelity patents by constructing high fidelity search strings to obtain high fidelity
patents and place them into one portion of the patent database. A lower fidelity search
strategy is run to obtain lower fidelity patents and place them into a separate portion of 
the database. The lower fidelity patents then can be examined on a more individual 
basis within the database to determine whether the patents belong in the patent portfolio 
analysis. 

For example, a high fidelity search string includes United States patent 
classifications whose patents are probably all high fidelity. Moreover, a high fidelity 
search string may include an assignee where it is already known that all patents of that 
assignee are highly relevant. As shown on Figure 1, the engines which produce the
coarse patent clusters use as input the filtered patents from the filter. However, it is to 
be understood that the present invention also includes not providing filtered patents to 
the engines. For example, the engines can examine the entire universe of patents or the 
engines can examine the patents of particular assignees. 
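
The two-tier filter can be sketched as a function that sorts search results into a high fidelity portion and a lower fidelity portion. The record fields (`number`, `us_class`, `assignee`, `title`) and the use of title keywords as the lower fidelity strategy are hypothetical; the specification does not define a record layout.

```python
def split_by_fidelity(patents, high_classes, high_assignees, low_keywords):
    """Return (high_fidelity_numbers, low_fidelity_numbers).

    A patent is high fidelity when its classification or assignee is on a
    trusted list; otherwise it is low fidelity when a keyword matches its
    title. Patents matching neither strategy are dropped as noise.
    """
    high, low = [], []
    for patent in patents:
        if patent["us_class"] in high_classes or patent["assignee"] in high_assignees:
            high.append(patent["number"])
        elif any(keyword in patent["title"].lower() for keyword in low_keywords):
            low.append(patent["number"])
    return high, low
```

The low fidelity list would then be reviewed individually before any of its patents are admitted to the portfolio analysis.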

Using the information from patent category engine and from the database of 
patents, patent portfolio engine produces in the preferred embodiment the following 
types of reports: claim breadth analysis reports; patent portfolio comparison reports; and 
patent clearance reports. Claim breadth analysis reports indicate such items as the
client's broadest claims, which may be the best candidates for patents a competitor
is most likely to infringe. This report can also indicate the client's longest (i.e. narrowest)
claims, which are probably the best candidates for discontinuing maintenance fee
payments. Moreover, claim breadth analysis reports may indicate the competitor's
shortest claims, which may be the best candidates for patents the client is most
likely to infringe.

Patent portfolio comparison reports include a comparison of the number of 
client's and competitor's patents for each category on: a raw total number basis; and a
difference number basis. Also this report includes a time trend analysis whereby for
each year in a predetermined time interval the number of patents of a client and of a 
competitor is examined for each category. 

Patent clearance reports assist a patent attorney in a freedom-to-practice study 
since patent clearance reports obtain relevant patents for the study which have been 
processed by the filter and which are sorted by United States patent classification so that 
the patent attorney can more quickly examine the claims of each of the relevant patents. 
Moreover, patent clearance reports can be sorted by claim breadth so that the shortest 
claims (which are more likely to be broader) are examined first. 

Example 

A core word linguistic software engine grouped patents into clusters based 
upon patent claims and abstracts. However, it should be understood that the 
present invention is not limited to only clustering on patent claims or patent 
abstracts but can cluster on any part of the patent. Moreover, two different 
clustering approaches were used. The first approach was to have patents 
assigned to one or more clusters. The second approach assigned patents to the 
one cluster with which the patent was most strongly associated. 

The core word linguistic software engine produced two files: a clustered
patents file and a core word keywords cluster file. A clustered patents file
contained: cluster number, cluster score, patent number, assignee, patent title.
Patents are clustered based upon claim or abstract text. The table below shows
an example of a clustered patent file.

Cluster   Cluster   Patent
Number    Score     Number    Assignee     Patent Title
1         16.3      5122976   Assignee A   Method and apparatus for remotely
                                           controlling sensor processing
                                           algorithms to expert sensor diagnoses
1         37.8      5107497   Assignee B   Technique for producing an expert
                                           system for system fault diagnosis


The second file is the core word keywords cluster file. The cluster's
keywords are used to categorize each cluster. The fields of the second file
preferably include: cluster number and key words. The table below shows an
example of core word keywords in a cluster file.

Cluster Number   Keywords
1                exper diagn compute store faul fail syst data address receive share retrieve



An initial set of categories is generated for each cluster. Since many clusters 
may be generated by the linguistic analysis engine, more general categories are 
preferably established to more easily analyze and portray the patent portfolio results. In 
the preferred embodiment, the linguistic analysis engine is able to vary the number of 
clusters for a group of patents. The resulting cluster-to-category mapping can be a 
many to one relationship since several clusters may be mapped to one category. For 
example, clusters 1, 8, 110 and 133 may all be mapped to a general category of "(A) 
Computer Heuristic Algorithms". Moreover, if a large number of clusters exist, then
preferably the categories may be arranged in a hierarchy so that a user can select
what level of detail is most fitting for the application at hand. For example, a general
category of "(A) Computer Heuristic Algorithms" may decompose into other categories of
"(A.1) fuzzy logic", "(A.2) neural networks", etc. If needed, these categories may in turn
decompose into still more detailed categories.

An inheritance principle exists between a parent and child category in that cluster
numbers, factor values, and patent counts of a child category are automatically inherited
by the parent category. For example, parent category B may have children categories B.1
and B.2. Child category B.1 has five patents with a particular factor breakdown and child 
category B.2 has seven patents with a particular factor breakdown. Parent category B 
would include the twelve patents with the cluster numbers and factor values of its 
children as well as any patents, cluster numbers, and factor values which parent 
category B itself has. 
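
The inheritance principle can be sketched as a recursive roll-up over the category hierarchy. The dictionaries and patent identifiers below are placeholders invented for illustration; only the counts (five patents in B.1, seven in B.2) follow the example in the text.

```python
def patents_in(category, children, own):
    """Return the category's own patents plus those inherited from all descendants."""
    result = list(own.get(category, []))
    for child in children.get(category, []):
        result.extend(patents_in(child, children, own))
    return result


# Hypothetical hierarchy matching the B / B.1 / B.2 example.
CHILDREN = {"B": ["B.1", "B.2"]}
OWN = {
    "B": [],
    "B.1": ["b1-%d" % i for i in range(5)],   # five placeholder patents
    "B.2": ["b2-%d" % i for i in range(7)],   # seven placeholder patents
}
```

Calling `patents_in("B", CHILDREN, OWN)` returns twelve patents. Cluster numbers and factor values could be rolled up in the same way by storing them alongside each patent.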

Since patents have been assigned to each cluster, the titles and the United
States Patent Office Classification titles for the patents are used to categorize a cluster.
Accordingly, an initial set of categories is developed based upon a brief review of the 
patents (usually the patent titles and the U.S. Patent Office Classification titles) and the 
cluster's keywords. 

It should be understood that the present invention includes a patent being placed 
in one or more clusters depending upon the linguistic algorithm used. For example, an 
expert system patent used to detect failures may be placed in both of the following 
clusters: a cluster which is directed to expert systems in general; and a cluster which
includes computer-related approaches for detecting failures (whether they be expert 
system approaches or another failure detection approach, such as through a threshold 
detection approach or through a neural network approach). 

Below are two clusters and how they were assigned to categories:

Cluster Number   Key Terms                                        Category
--------------   ----------------------------------------------   --------------------
1                exper diagn compute store faul fail syst data    (A.1) Fuzzy Logic
                 address receive share retrieve
[illegible]      neur diagn netw compute weig store faul fail     (A.2) Neural Network
                 syst data address nod share retrieve

A factor value is determined which indicates how well a patent fits within a
cluster. Each patent has a "cluster score" which indicates how strongly the patent fits
the keywords of a cluster. For example, patent 5,122,976 has a cluster score of
16.3 for Cluster #1. Patent 5,107,497 has a cluster score of 37.8 for Cluster #1. The
higher cluster score indicates that patent 5,107,497 "fits" better with the keywords of
Cluster #1 than patent 5,122,976 does.

A factor value is utilized to indicate the fact that the second patent fits more 
closely with the keywords of Cluster #1 than the first patent. The following factor values 
are used: 



Cluster Score                Factor Value
--------------------------   ------------
Cluster Score > 30           1
20 < Cluster Score < 30      .75
10 < Cluster Score < 20      .5
0 < Cluster Score < 10       .25
Cluster Score = 0            0



Each patent in each cluster is associated with the appropriate factor value based 
upon its cluster score. 
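
The mapping from cluster score to factor value can be sketched as a simple threshold function. This is an illustrative sketch, not the application's implementation: the function name factor_value is hypothetical, and because the printed ranges use strict inequalities (leaving scores of exactly 10, 20, and 30 unassigned), the sketch assumes that each lower bound is inclusive.

```python
def factor_value(cluster_score: float) -> float:
    """Map a cluster score to a factor value using the ranges in the table above."""
    if cluster_score >= 30:   # assumption: a score of exactly 30 maps to 1
        return 1.0
    if cluster_score >= 20:   # assumption: a score of exactly 20 maps to .75
        return 0.75
    if cluster_score >= 10:   # assumption: a score of exactly 10 maps to .5
        return 0.5
    if cluster_score > 0:
        return 0.25
    return 0.0                # a score of 0 (and, by assumption, anything below it)


first = factor_value(16.3)    # patent 5,122,976 in Cluster #1
second = factor_value(37.8)   # patent 5,107,497 in Cluster #1
```

Under these assumptions the two patents discussed above receive factor values of .5 and 1, respectively.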

If it is desired to determine how many patents an assignee has in each category,
then the factor values are summed for each assignee in each category. The following
table shows an example of a factor value breakdown of cluster number 1 for each
Assignee for category A.1 (note that the other cluster numbers are omitted below for
easier viewing of the table):

Category   Category Title   Assignee     Factor Value   Cluster Number   Cluster Score
--------   --------------   ----------   ------------   --------------   -------------
A.1        Fuzzy Logic      Assignee A   0.5            1                15
A.1        Fuzzy Logic      Assignee B   1              1                [illegible]
A.1        Fuzzy Logic      Assignee B   1              1                30
A.1        Fuzzy Logic      Assignee B   1              1                37
A.1        Fuzzy Logic      Assignee B   0.75           1                28
A.1        Fuzzy Logic      Assignee B   0.75           1                25
A.1        Fuzzy Logic      Assignee B   1              1                33
A.1        Fuzzy Logic      Assignee B   0.75           1                26
A.1        Fuzzy Logic      Assignee B   1              1                32

The factor sum for Assignee A for Cluster #1 (which is assigned with other
Clusters to Category A.1) is 0.5. The factor sum for Assignee B for Cluster #1 (which is
assigned with other Clusters to Category A.1) is 7.25.
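
The summation above can be reproduced with a short sketch. The function name sum_factors and the data layout are hypothetical; the factor values are those read from the Cluster #1 breakdown above, with the two poorly printed Assignee B entries read as 1, which is consistent with the stated sum of 7.25.

```python
from collections import defaultdict

# (assignee, factor value) pairs for Cluster #1 within category A.1.
cluster_1_rows = [
    ("Assignee A", 0.5),
    ("Assignee B", 1.0),
    ("Assignee B", 1.0),
    ("Assignee B", 1.0),
    ("Assignee B", 0.75),
    ("Assignee B", 0.75),
    ("Assignee B", 1.0),
    ("Assignee B", 0.75),
    ("Assignee B", 1.0),
]


def sum_factors(rows):
    """Sum the factor values separately for each assignee."""
    totals = defaultdict(float)
    for assignee, factor in rows:
        totals[assignee] += factor
    return dict(totals)


totals = sum_factors(cluster_1_rows)
```

Repeating the same summation over every cluster assigned to a category would give the per-category totals for each assignee.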

The present invention can graph the results which were obtained using the
"Factor Approach". The Summed Factor Values for each Assignee and for each
Category are graphed side-by-side. For the Fuzzy Logic category, the summed value of
18.75 indicates that Assignee A has approximately 19 Fuzzy Logic Patents, while the
summed value of 26.5 indicates that Assignee B has approximately 27 Fuzzy Logic Patents.

Also, the "difference" between the Assignees' Factor Values was determined
and graphed. For example, the difference between the Assignees' Factor Values for the
"Fuzzy Logic" Category was "18.75-26.5" or "-7.75". The -7.75 value indicates that
Assignee B has approximately 8 more Fuzzy Logic patents than Assignee A. Through
use of the present invention, the relative patent portfolio metric produces a more 
accurate assessment of how Assignee A stands with respect to other assignees. This 
may be due to any biases which enter into the algorithm on an absolute basis being 
cancelled when a relative comparison (or delta) is performed among the assignees' 
portfolios. 
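
The relative comparison can be sketched as follows, using the Fuzzy Logic totals quoted in the text. The names round_half_up, summed_factors, and approx_lead are hypothetical. A half-up rounding helper is used because Python's built-in round() rounds 26.5 to 26, whereas the text treats 26.5 as approximately 27.

```python
import math


def round_half_up(value: float) -> int:
    """Round a non-negative value to the nearest integer, with halves rounded up."""
    return math.floor(value + 0.5)


# Summed factor values for the Fuzzy Logic category, as quoted in the text.
summed_factors = {"Assignee A": 18.75, "Assignee B": 26.5}

# Relative metric: Assignee A's total minus Assignee B's total.
delta = summed_factors["Assignee A"] - summed_factors["Assignee B"]

# Approximate patent counts and the approximate size of Assignee B's lead.
approx_counts = {name: round_half_up(v) for name, v in summed_factors.items()}
approx_lead = round_half_up(abs(delta))
```

A negative delta indicates that the second assignee holds more patents in the category than the first.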

It is to be understood that the present invention is not limited to only examining 
two assignees, but includes comparing more than two assignees' patent portfolios. 
Moreover, it is to be understood that the present invention examines patents 
independent of assignee. 

Bar graphs are produced that depict how many patents each Assignee has per 
category. Also, bar graphs are produced that depict the difference in the number of 
patents between two assignees for each category. 

The present invention can also graph the results not using the "Factor Approach". 
The number of patents that each Assignee had within each Category can be graphed. 
Moreover, the "difference" between the Assignees' number of patents for a particular 
category can be graphed. 



The graphs can also show a time trend. The number of patents per category per 
assignee can be graphed on a yearly basis to indicate the growth status for the number 
of patents of a particular assignee. 

The present invention can also depict the breadth of a claim by a claim breadth
number. The claim breadth number for each independent claim is determined based
upon the number of words that the claim contains. Since the preamble typically contains
fewer restrictions upon a claim's breadth, the claim breadth number is reduced by
half the number of words within the preamble.




For example, Assignee A's Patent 5,122,976 (entitled "Method and apparatus for
remotely controlling sensor processing algorithms to expert sensor diagnoses") had a
claim breadth number of "39" for its Claim 1 and an adjusted claim breadth number of
"37" (since the three-word preamble, "An apparatus, comprising:", gives "three words
divided by two", which rounds up to two, and 39 less two is 37):



Patent No.: 5,122,976
Unadjusted Breadth No.: 39
Adjusted Breadth No.: 37
Claim Text:

1. An apparatus, comprising:
control means for sampling sensor data and performing
sensor data processing; and
diagnostic means for diagnosing a sensor malfunction
using the sensor data, and said control means performing the
sensor data processing responsive to the diagnosis.

Patent No.: 5,107,497
Unadjusted Breadth No.: 402
Adjusted Breadth No.: 377
Claim Text:

1. A method of forming a knowledge base in a computer for
producing an expert system for diagnosing a predetermined
arrangement of a system to determine if the system contains a
fault, said system comprising a plurality of components
having respective predetermined failure rates, the method
comprising the steps of:
(a) decomposing the system into groups of sequential and
parallel subsystems, each of said subsystems comprising at
least one of said components;
(b) generating a tree structure of the groups of step (a) by
attaching nodes to each parallel and sequential link between
subsystems in the tree to provide a tree configuration of sets
of components suspected of being faulty and possible choice
measurement sets;
(c) computing a lower bound cost of a sequence of tests for
each of the parallel and sequential subsystems using a first
rule that (1) if a node is a parallel node, then the lower bound
cost for that node is computed by
(i) sorting numerically and in a first predetermined order a
first list P of the failure rates of the components of each
subsystem,
(ii) sorting numerically and in a second predetermined order a
second list L of test costs of the components of each
subsystem, and
(iii) for corresponding elements in lists P and L, computing a
product of each of the corresponding elements, and (2) a
second rule that if the node is a sequential node, then the
lower bound cost of the sequence of test cases for that node is
computed by
(i) separately sorting numerically and in a predetermined
order each of the failure rate and the test cost for each
component of each subsystem in the first and second lists P
and L, respectively,
(ii) initializing a variable h to zero,
(iii) selecting the lowest valued two numbers p.sub.1 and
p.sub.2 from the list P,
(iv) computing a current value for failure rate p by
summing p.sub.1 and p.sub.2
(v) selecting a first member c from list L,
(vi) summing the current value of h with the product of the
value of p.sub.1 and p.sub.2 from step (iv), and placing such
sum for the current value for h,
(vii) inserting the current value of p in numerical order in list
P, and
(viii) repeating steps (iii) to (vii) until p=1; and
(d) generating a diagnostic knowledge base for generating a
diagnostic fault testing sequence at an output of the
computer.

Patent 5,107,497, on the other hand, has a relatively high Adjusted Claim Breadth
number. If, for example, the purpose of the patent portfolio analysis is to determine
which patents of the client are candidates for not maintaining through payment of
maintenance fees, then this patent is a likely candidate due to its tendency to be too
narrow to provide adequate protection for the client.

The preferred embodiment counts the words in a claim by counting the blank
spaces (that is, ASCII code 32) in the claim. This approach helps accelerate processing
since the database may include hundreds of thousands of claims. The preferred 
approach also only examines the claim breadth of independent claims. 
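
The claim breadth calculation can be sketched as below, using Claim 1 of Patent 5,122,976 as quoted earlier. This is a sketch under stated assumptions rather than the preferred embodiment itself: the function name claim_breadth is hypothetical; the claim text is assumed to be single-spaced and to begin with its claim number, so that the count of blank spaces equals the word count; and the preamble is assumed to end at the first colon, since the application does not state how the preamble is delimited.

```python
import math
from typing import Tuple


def claim_breadth(claim_text: str) -> Tuple[int, int]:
    """Return (unadjusted, adjusted) claim breadth numbers for one independent claim."""
    # Count words by counting blank spaces (ASCII code 32).
    unadjusted = claim_text.count(" ")
    # Assumption: the preamble is everything before the first colon.
    preamble = claim_text.split(":", 1)[0]
    preamble_words = preamble.count(" ")
    # Subtract half the preamble word count, rounded up.
    adjusted = unadjusted - math.ceil(preamble_words / 2)
    return unadjusted, adjusted


CLAIM_1 = (
    "1. An apparatus, comprising: "
    "control means for sampling sensor data and performing "
    "sensor data processing; and "
    "diagnostic means for diagnosing a sensor malfunction "
    "using the sensor data, and said control means performing the "
    "sensor data processing responsive to the diagnosis."
)

unadjusted, adjusted = claim_breadth(CLAIM_1)
```

Counting a single character avoids tokenizing each claim, which matches the stated goal of fast processing over a large database of claims.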

While the invention has been described in its presently preferred embodiments, it 
will be understood that the invention is capable of certain modification without departing 
from the spirit of the invention. 